Abstract

Background: Mycoplasma hyopneumoniae is the causative agent of porcine enzootic pneumonia (EP), a mild, 
chronic pneumonia of swine. Despite presenting with low direct mortality, EP is responsible for major economic 
losses in the pig industry. To identify the virulence-associated determinants of M. hyopneumoniae, we determined 
the whole genome sequence of M. hyopneumoniae strain 168 and its attenuated high-passage strain 168-L and 
carried out comparative genomic analyses. 

Results: We performed the first comprehensive analysis of M. hyopneumoniae strain 168 and its attenuated strain 
and made a preliminary survey of coding sequences (CDSs) that may be related to virulence. The 168-L genome 
has a highly similar gene content and order to that of 168, but is 4,483 bp smaller because there are 60 insertions 
and 43 deletions in 168-L. Besides these indels, 227 single nucleotide variations (SNVs) were identified. We further 
investigated the variants that affected CDSs, and compared them to reported virulence determinants. Notably, 
almost all of the reported virulence determinants are among the CDSs affected by these variants. In addition to
variations previously described in mycoplasma adhesins (P97, P102, P146, P159, P216, and LppT), cell envelope 
proteins (P95), cell surface antigens (P36), secreted proteins and chaperone protein (DnaK), mutations in genes 
related to metabolism and growth may also contribute to the attenuated virulence in 168-L. Furthermore, many 
mutations were located in the previously described repeat motif, which may be of primary importance for 
virulence. 

Conclusions: We studied the virulence attenuation mechanism of M. hyopneumoniae by comparative genomic 
analysis of virulent strain 168 and its attenuated high-passage strain 168-L. Our findings provide a preliminary 
survey of CDSs that may be related to virulence. While these include reported virulence-related genes, other novel 
virulence determinants were also detected. This new information will form the foundation of future investigations 
into the pathogenesis of M. hyopneumoniae and facilitate the design of new vaccines. 
Background

Mycoplasma hyopneumoniae causes porcine enzootic pneumonia, which is a mild, chronic pneumonia of
swine [1]. This highly infectious organism has a worldwide distribution. The primary mycoplasmal
infection often becomes complicated by secondary bacterial and viral infections [2], resulting in
more severe lung lesions and production losses. Relative control has been achieved through active
vaccination programs, but porcine enzootic pneumonia continues to be a major economic problem in
the swine industry. While progress has been made in understanding the molecular basis of some
Mycoplasma diseases [3], advances in M. hyopneumoniae research have been hampered by its fastidious
growth conditions and the lack of genetic tools and transformation protocols. To date, few virulence
determinants or virulence-associated determinants have been identified. Attachment to the
respiratory epithelium is a prerequisite for host colonization and is mediated by the membrane
protein P97 [4]. This protein is located on the outer membrane surface, and its role in adherence
has been firmly established. The region of P97 that mediates adherence to swine cilia is thought to
be the R1 region, near the C-terminus of the protein [5]. To bind cilia, a minimum of eight tandem
copies of the pentapeptide sequence (AAKPV/E) in R1 are required [5]. Although the function of R2
in vivo is unknown, both it and R1 are required to bind heparin [6]. The P97 genes of
M. hyopneumoniae strains 7448, 232, and J code for proteins with 10, 15, and 9 of the previously
described R1 repeating units (AAKPV/E), respectively; all three strains have more than the minimum
of eight tandem copies required for cilium binding [7]. Moreover, monoclonal antibodies F1B6 and
F2G5, which both react predominantly with P97 [4,5], only partially block adherence of
M. hyopneumoniae to receptors on epithelial cell cilia [8]. These observations indicate that
molecules other than P97 play a role in facilitating adherence of M. hyopneumoniae to swine cilia.
Comparative transcriptomic and proteomic studies have also been performed to study transcriptional
changes that occur during disease and to investigate differentially expressed proteins in
pathogenic and non-pathogenic strains [9-11]. Several M. hyopneumoniae proteins, including
immunodominant proteins (P36 [12], P46 [13], and P65 [14]), adhesin-related proteins (P102 [15],
P146 [16], P159 [17], P216 [18], and LppT [16]), and a 54-kDa cytotoxic factor [19], have been
characterized; however, the biological functions of these proteins in pathogenesis are not well
understood.

Comparative genomic analysis has previously revealed mechanisms of M. hyopneumoniae pathogenicity
[7] and predicted unidentified virulence factors, including genes involved in secretion and/or
traffic between host and pathogen cells, or in evasion and/or modulation of the host immune system
[20,21]. In 2005, Vasconcelos et al. sequenced a pathogenic and a non-pathogenic strain of
M. hyopneumoniae and used a comparative genomics approach to identify putative virulence genes [7].
They identified various CDSs that could be considered candidate virulence genes, including cilium
adhesin homologs, lipoproteins, and other components that might contribute to virulence [7].
However, a comparative genomic analysis of a virulent M. hyopneumoniae strain and its attenuated
derivative is lacking.

The need to control the spread of M. hyopneumoniae prompted the development of live attenuated
vaccine strains. M. hyopneumoniae strain 168-L has been extensively used as a vaccine against
M. hyopneumoniae in China [22,23]. This attenuated vaccine strain is derived from the virulent
parent strain 168. Strain 168 was originally isolated in 1974 from an Er-hua-nian pig (a Chinese
local breed very sensitive to M. hyopneumoniae) with typical clinical and pathogenic
characteristics of mycoplasmal pneumonia of swine (MPS) [24]. This field strain was gradually
attenuated by more than 300 continuous passages through KM2 cell-free medium (a modified Friis
medium), and the 380th passage was named strain 168-L. Currently, the genetic basis for the
attenuation of virulence in 168-L is poorly understood.

To gain new insight into the components that contribute to virulence and the mechanisms by which
M. hyopneumoniae causes disease, we sequenced the genomes of strains 168 and 168-L. This allowed us
to perform the first comprehensive analysis of virulent and attenuated strains, and to identify
CDSs that may be related to virulence. We further investigated these putative virulence-related
CDSs and compared them with reported virulence determinants. Notably, almost all reported virulence
determinants were found among the putative virulence-related CDSs. Besides the reported virulence
determinants, other candidate virulence genes were also identified. The study of these candidate
virulence genes and their corresponding products will be important for better understanding the
pathogenesis of M. hyopneumoniae.

Results and discussion 

Genomic features of M. hyopneumoniae 168-L and its 
global comparison with pathogenic strain 168 

The complete genome of M. hyopneumoniae 168-L consists of a single circular chromosome of
921,093 bp (GC content 28.46%) (GenBank accession number CP003131). A total of 689 protein-encoding
genes were predicted. The average protein size is 378 amino acids and the mean coding percentage is
84.8%. Approximately 51% of genes were assigned to specific functional clusters of orthologous
groups (COGs), and 28% were assigned an enzyme classification (EC) number (Figure 1). Comparison
with the M. hyopneumoniae 168 genome (GenBank accession CP002274) revealed a highly conserved gene
content and order between the two strains. The 168-L genome is 4,483 bp smaller than that of 168
(925,576 bp), because there are 60 insertions and 43 deletions (indels; insertions and deletions of
any size) in 168-L relative to 168 (see Additional file 1: Table S1; Additional file 2: Table S2;
Additional file 3: Table S3). Among these, 33 indels are located in predicted CDSs, and 70 are in
noncoding regions. Besides these indels, 227 single nucleotide variations (SNVs) were identified
between 168 and 168-L (Additional file 4: Table S4). While 31 SNVs were mapped to intergenic
regions, 196 were in coding regions, inducing amino acid substitutions, frameshifts, and
translational stops.

Figure 1 Genome architecture. The dnaA gene is at position zero. Moving inside, the first circle
shows the genome length (units in M. hyopneumoniae); the second and third circles show the
locations of the predicted CDSs on the plus and minus strands, respectively, color-coded by COG
category (the color codes for the functional assignments are shown in the key); the fourth circle
shows tRNAs (purple) and rRNAs (red); the fifth circle shows the centered GC (G+C) content of each
CDS (blue: above mean; cyan: below mean); and the sixth circle shows the GC (G+C) skew plot (red:
above zero; pink: below zero). Circles 7-10 show a comparative amino acid analysis of 168, with
amino acid identities color-coded according to the similarity shown in the key, against strains
168-L (seventh circle), 232 (eighth circle), J (ninth circle), and 7448 (tenth circle).
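
The counts reported in this subsection can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only figures stated in the text; the variable names are illustrative and are not part of the original analysis.

```python
# Figures as reported in the text for strains 168 and 168-L.
GENOME_168_BP = 925_576
GENOME_168L_BP = 921_093

insertions, deletions = 60, 43
indels_in_cds, indels_noncoding = 33, 70
snvs_intergenic, snvs_coding = 31, 196

# Derived totals that should agree with the values quoted in the text.
size_difference = GENOME_168_BP - GENOME_168L_BP
total_indels = insertions + deletions
total_snvs = snvs_intergenic + snvs_coding

print(f"168-L is {size_difference:,} bp smaller than 168")
print(f"indels: {total_indels} ({indels_in_cds} in CDSs, {indels_noncoding} noncoding)")
print(f"SNVs: {total_snvs}")
```

The derived totals match the quoted genome size difference, the sum of indels by location, and the total SNV count.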

ISMHp1-related genetic variations between 168 and 168-L

The difference between the genome sizes of strains 168 and 168-L is mainly due to differences in
the duplication of insertion sequence (IS) elements. IS elements are distributed stochastically
across the entire genome of both strains. The 168-L genome contains nine complete IS elements and
one disrupted IS element, which is almost identical to that of 168 except for slight differences in
ISMHp1. There are nine complete copies of ISMHp1 in 168-L, but 12 copies in 168. The difference in
ISMHp1 copy number between 168 and 168-L is due to three complete ISMHp1 deletions (located 690 kb,
870 kb, and 900 kb from oriC) and one complete ISMHp1 inversion, which was originally located
378 kb from oriC but was inverted in 168-L (1656 bp, located 372 kb from oriC) (Figure 2a).

Other than the IS elements, notable large-scale genomic differences were also observed. Compared
with strain 168, a genomic deletion of approximately 1.36 kb (locus 1: between MHP168L_311 and
MHP168L_729) was identified in 168-L; it had been replaced by a novel insertion sequence of
approximately 2.32 kb (locus 2) that was joined to a complete ISMHp1 element (Figure 2b). This
168-L-related insertion fragment was also observed in strains 7448 and 232.

Molecular analysis of integrative conjugative element (ICE) 

The integrative conjugative element (ICE) is a mobile DNA element that is probably involved in
genomic recombination events and in pathogenicity. The ICEH elements are more divergent than is
typical of other chromosomal loci in M. hyopneumoniae [25], suggesting an accelerated evolution of
these constins [26]. During a survey of specific sequences, a 26.9-kb region with similarity to the
integrative conjugal element of M. fermentans (ICEF) [27] was found in strain 168 and was
designated ICEH (for integrative conjugal element of M. hyopneumoniae). Unlike ICEH in strains 7448
and 232, which consist of 19 and 22 CDSs, respectively, ICEH168 consists of 20 CDSs (Additional
file 5: Table S5). The organization of these elements is very similar. Some CDSs show similarity to
tra genes, such as traK, traI, and traE, which are usually associated with bacterial conjugative
plasmids [26]. ICEH168 has three tra genes: one traG and two copies of traE. In addition, a CDS
encoding a single-strand binding protein (SSB), which is essential for the transfer process, was
also observed.

Figure 2 Alignment of the whole genomes and inverted regions. (a) Alignment of the
M. hyopneumoniae 168 and 168-L genomes. (b) Alignment of the inverted regions of 168 relative to
168-L. The gray bars represent the forward and reverse strands. Green triangles represent ISMHp1
elements. DNA BLASTN alignments (BLASTN matches) between the two sequences are indicated by a red
(same strand) or blue (opposite strand) line.

A previous ICE analysis of three M. hyopneumoniae genomes (7448, J, and 232) revealed that ICEH is
present in the two pathogenic strains (7448 and 232) but absent from the non-pathogenic J strain
[26]. This raised the question of whether ICEH is present in the attenuated vaccine strain 168-L.
Interestingly, ICEH was also observed in strain 168-L. Moreover, ICEH168 and ICEH168-L are almost
identical, except for a missense mutation (G192E) identified in ICEH-ORF3 (MHP168_235). Our
analyses indicate that ICEH may not be present only in pathogenic strains of M. hyopneumoniae.

Mutations affecting epithelium adhesion 

In our previous study, the abilities of strains 168 and 168-L to adhere to and damage cilia were
compared using scanning electron microscopy. The results showed that the pathogenic strain 168
adheres to cilia, inducing tangling, clumping, and longitudinal splitting of cilia, whereas strain
168-L does not cause ciliary damage compared with the control group [28]. The adherence of
M. hyopneumoniae to porcine ciliated respiratory cells is essential for the organism to colonize
the respiratory epithelium and cause pneumonia [4]. The adherence process is mainly mediated by
receptor-ligand interactions, and the M. hyopneumoniae proteins possibly involved in these
interactions are obvious candidates as virulence factors [8,29-31]. We investigated the genetic
variation between strains 168 and 168-L (Table 1; Additional file 6: Table S6), and compared
mutations affecting CDSs corresponding to previously described mycoplasma adhesins (P97, P102,
P146, P159, P216, MgPa, LppS, and LppT) [7]. Notably, almost all the reported mycoplasma adhesins
are included in the CDSs affected by mutations (Table 1).

Table 1 Complete list of "168-L-specific" genetic variations in CDSs

168-L gene | 168-L gene product | Variation | 168 locus | Effect on 168-L coding
MHP168L_010 | Putative uncharacterized protein | SNV | MHP168_010 | L368A substitution
MHP168L_014 | Fructose-bisphosphate aldolase | Deletion+SNV | MHP168_014 | N-terminal deletion+SNV
MHP168L_020 | ABC transporter ATP-binding protein | Deletion | MHP168_020 | 264-aa deletion
MHP168L_022 | Putative uncharacterized protein | SNV | MHP168_022 | S22S substitution
MHP168L_030 | Topoisomerase IV subunit A | SNV | MHP168_030 | K140E substitution
MHP168L_033 | Predicted protein | Deletion | MHP168_033 | N-terminal deletion
MHP168L_[illegible] | hypothetical protein | SNV | MHP168_[illegible] | K177R,A221T substitution
MHP168L_[illegible] | Glycyl-tRNA synthetase | SNV | MHP168_[illegible] | P226P substitution
MHP168L_[illegible] | hypothetical protein | SNV | MHP168_[illegible] | Y552H substitution
MHP168L_065 | Predicted protein | Deletion | MHP168_065 | Frameshift out 20aa
MHP168L_[illegible] | hypothetical protein | Insertion+SNV | MHP168_[illegible] | Frameshift out 42aa; F324S substitution
MHP168L_069 | Chaperone protein dnaK | SNV | MHP168_069 | P407P substitution
MHP168L_[illegible] | Amino acid permease | SNV | MHP168_[illegible] | R13K,T396A substitution
MHP168L_[illegible] | NADH oxidase | Insertion+SNV | MHP168_[illegible] | "TG" insertion; stop; 5aa truncation; E395G substitution
MHP168L_086 | Thymidine phosphorylase | SNV | MHP168_086 | V15F substitution
MHP168L_103 | Outer membrane protein-P95 | SNV | MHP168_103 | Stop; 149-aa truncation
MHP168L_105 | ATP-dependent protease binding protein | SNV | MHP168_105 | I159I substitution
MHP168L_110 | Protein p97, cilium adhesin | SNV | MHP168_110 | SNV in repeat region
MHP168L_114 | 50S ribosomal protein L2 | SNV | MHP168_114 | A2A substitution
MHP168L_127 | 50S ribosomal protein L15 | SNV | MHP168_127 | E95K substitution
MHP168L_142 | Phosphopentomutase | SNV | MHP168_142 | N8I substitution
MHP168L_152 | ribulose-phosphate 3-epimerase | Deletion+SNV | MHP168_152 | I211F substitution
MHP168L_167 | L-lactate dehydrogenase | SNV | MHP168_167 | N204D substitution
MHP168L_1[illegible] | Hexosephosphate transport protein | SNV | MHP168_1[illegible] | Q122S substitution
MHP168L_182 | Putative uncharacterized protein | Deletion | MHP168_182 | Frameshift; 105-aa truncation
MHP168L_186 | Pyruvate dehydrogenase E1-alpha subunit | SNV | MHP168_186 | S194G substitution
MHP168L_1[illegible] | Adenine phosphoribosyltransferase | SNV | MHP168_1[illegible] | 534-aa N-terminal extension
MHP168L_1[illegible] | Protein P102 | SNV | MHP168_1[illegible] | L677L substitution
MHP168L_2[illegible] | ABC transporter ATP-binding protein | SNV | MHP168_2[illegible] | K258E substitution
MHP168L_212 | Oligopeptide transport system permease protein | SNV | MHP168_212 | P203S substitution
MHP168L_235 | Putative ICEF Integrative Conjugal Element-II | SNV | MHP168_235 | G192E substitution
MHP168L_243 | Serine hydroxymethyltransferase | Insertion | MHP168_243 | No change
MHP168L_264 | ISMHp1 transposase | SNV | MHP168_264 | S232P,C229R substitution
MHP168L_275 | lipoate-protein ligase A | SNV | MHP168_275 | A179T substitution
MHP168L_284 | Cobalt import ATP-binding protein cbiO 1 | SNV | MHP168_284 | K64E substitution
MHP168L_289 | Cation-transporting P-type ATPase | SNV | MHP168_289 | E100D substitution
MHP168L_308 | putative ABC transporter ATP-binding protein | SNV | MHP168_308 | T708A,M700L substitution
MHP168L_311 | Putative uncharacterized protein | SNV | MHP168_311 | D266E substitution
MHP168L_312 | Putative uncharacterized protein | Insertion+SNV | MHP168_312 | Insertion; R49R,[illegible]247L substitution
MHP168L_314 | Putative uncharacterized protein | Insertion+deletion | MHP168_314 | Stop; 14-aa truncation
MHP168L_322 | hypothetical protein | Insertion | MHP168_322 | Frameshift; 5-aa truncation
MHP168L_345 | Predicted protein | Deletion | MHP168_345 | Frameshift; 53-aa truncation
MHP168L_355 | Putative uncharacterized protein | Insertion | MHP168_355 | 21-aa N-terminal deletion
MHP168L_361 | hypothetical protein | SNV | MHP168_361 | L265L substitution
MHP168L_377 | Putative uncharacterized protein | SNV | MHP168_377 | E407K substitution
MHP168L_378 | P60-like lipoprotein | SNV | MHP168_378 | S143N substitution
MHP168L_379 | HIT-like protein | SNV | MHP168_379 | D58N substitution
MHP168L_381 | hypothetical protein | Insertion | MHP168_381 | Frameshift
MHP168L_386 | P37-like ABC transporter substrate-binding lipoprotein | Insertion | MHP168_386 | Frameshift out 84aa
MHP168L_389 | Putative membrane lipoprotein | SNV | MHP168_389 | A265S substitution
MHP168L_392 | lipoprotein | SNV | MHP168_392 | T6M substitution
MHP168L_394 | ABC transporter permease protein | SNV | MHP168_394 | A492G substitution
MHP168L_400 | Ribonuclease III | SNV | MHP168_400 | V58I substitution
MHP168L_401 | Putative uncharacterized protein | Deletion | MHP168_401 | Frameshift out 133aa
MHP168L_409 | Putative type III restriction-modification system: methylase | Deletion | MHP168_409 | Frameshift out 171aa
MHP168L_412 | ISMHp1 transposase | SNV | MHP168_412 | G50V substitution
MHP168L_413 | ABC transporter ATP-binding protein | SNV | MHP168_413 | D118Y substitution
MHP168L_423 | Putative uncharacterized protein | SNV | MHP168_423 | Y18S substitution
MHP168L_424 | Lppt protein | SNV | MHP168_424 | L814F substitution
MHP168L_434 | hypothetical protein | SNV | MHP168_434 | W107C substitution
MHP168L_444 | Putative uncharacterized protein | SNV | MHP168_444 | V123A substitution
MHP168L_445 | Putative uncharacterized protein | Deletion | MHP168_445 | 3-aa deletion in repeat region
MHP168L_454 | Putative uncharacterized protein | Insertion+SNV | MHP168_454 | L423I substitution
MHP168L_455 | Predicted protein | Deletion+SNV | MHP168_455 | 59-aa extension; P51P substitution
MHP168L_456 | Putative uncharacterized protein | SNV | MHP168_456 | substitution
MHP168L_457 | Putative uncharacterized protein | Insertion+deletion+SNV | MHP168_457 | Frameshift; substitution
MHP168L_462 | ABC transporter ATP-binding protein | SNV | MHP168_462 | V219L substitution
MHP168L_463 | ABC transporter permease protein | SNV | MHP168_463 | Amino acid substitution
MHP168L_473 | hypothetical protein | Insertion | MHP168_473 | Frameshift out 84aa
MHP168L_482 | Phosphoenolpyruvate protein phosphotransferase | SNV | MHP168_482 | E562G substitution
MHP168L_490 | hypothetical protein | Insertion | MHP168_490 | Stop; 114-aa truncation
MHP168L_498 | Putative uncharacterized protein | SNV | MHP168_498 | D137E substitution
MHP168L_503 | P216 surface protein | Insertion | MHP168_503 | Q repeat insertion
MHP168L_504 | P159 membrane protein | SNV | MHP168_504 | G403D,L375S,G240A substitution
MHP168L_505 | YX1 | SNV | MHP168_505 | V175I substitution
MHP168L_506 | Putative uncharacterized protein | SNV | MHP168_506 | N16N substitution
MHP168L_507 | asparagine synthetase A | SNV | MHP168_507 | I85M substitution
MHP168L_510 | Oligopeptide transport system permease protein | SNV | MHP168_510 | S19S substitution
MHP168L_523 | Xylose ABC transporter ATP-binding protein | SNV | MHP168_523 | S8S substitution
MHP168L_531 | Putative uncharacterized protein | Deletion+SNV | MHP168_531 | Amino acid substitution
MHP168L_541 | Predicted protein | SNV | MHP168_541 | L50L substitution
MHP168L_557 | Potassium uptake protein | SNV | MHP168_557 | D69Y substitution
MHP168L_558 | Potassium uptake protein | SNV | MHP168_558 | H425Y,N477D substitution
MHP168L_559 | hypothetical protein | SNV | MHP168_559 | D1236G substitution
MHP168L_567 | hypothetical protein | SNV | MHP168_567 | D68N substitution
MHP168L_571 | hypothetical protein | Insertion+deletion+SNV | MHP168_571 | Frameshift; amino acid substitution
MHP168L_573 | Putative uncharacterized protein | Deletion+SNV | MHP168_573 | Amino acid substitution; N-terminal extension
MHP168L_576 | ISMHp1 transposase | SNV | MHP168_576 | Amino acid substitution
MHP168L_589 | Membrane nuclease, lipoprotein | SNV | MHP168_589 | S98S substitution
MHP168L_596 | Glycerol-3-phosphate dehydrogenase | SNV | MHP168_596 | L155V substitution
MHP168L_600 | ABC transporter ATP binding protein | SNV | MHP168_600 | P1014L,P1037L substitution
MHP168L_606 | hypothetical protein | Deletion+SNV | | Frameshift; amino acid substitution
MHP168L_614 | ABC transporter xylose-binding lipoprotein | SNV | MHP168_614 | I51V substitution
MHP168L_621 | hypothetical protein | SNV | MHP168_621 | G87G substitution
MHP168L_631 | ABC transporter ATP-binding-Pr1 | SNV | MHP168_631 | R60W substitution
MHP168L_638 | Putative uncharacterized protein | Insertion+SNV | MHP168_638 | K insertion in K repeat region; K7K substitution
MHP168L_639 | 5'-nucleotidase precursor | SNV | MHP168_639 | E411K substitution
MHP168L_666 | Ribosomal RNA small subunit methyltransferase G | Deletion | MHP168_666 | Frameshift out 12aa
MHP168L_668 | Prolipoprotein p65 | SNV | MHP168_668 | T138A substitution
MHP168L_671 | XAA-PRO aminopeptidase | SNV | MHP168_671 | G321G substitution
MHP168L_672 | Putative uncharacterized protein | SNV | MHP168_672 | E104E substitution
MHP168L_673 | Putative uncharacterized protein | SNV | MHP168_673 | I52K,E127K substitution
MHP168L_675 | hypothetical protein | SNV | MHP168_675 | Y175D substitution
MHP168L_676 | P146 adhesin like-protein, p97 paralog | Insertion+SNV | MHP168_676 | Q insertion in PQ repeat region; S404S,W404R substitution
MHP168L_688 | Putative uncharacterized protein | SNV | MHP168_688 | K48N substitution
MHP168L_698 | hypothetical protein | SNV | MHP168_698 | R126G,V159G substitution
MHP168L_707 | hypothetical protein | Insertion | MHP168_707 | "F" insertion
MHP168L_747 | hypothetical protein | Insertion+deletion+SNV | MHP168_091 | N repeat insertion; 39-aa N-terminal extension; SNV
MHP168L_748 | Protein P102 | Deletion+SNV | MHP168_108 | Stop; 782-aa truncation; SNV
MHP168L_749 | Putative type III restriction-modification system: methylase | Deletion | MHP168_730 | N-terminal deletion
MHP168L_750 | hypothetical protein | Deletion | MHP168_435 | Frameshift
MHP168L_r002 | 16S ribosomal RNA | SNV | MHP168_r002 | Amino acid substitution
MHP168L_t027 | tRNA-Ser | SNV | MHP168_t027 | Amino acid substitution

Liu et al. BMC Genomics 2013, 14:80
http://www.biomedcentral.com/1471-2164/14/80
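
The "Effect on 168-L coding" entries in Table 1 use one-letter substitution notation (reference residue, position, variant residue), so entries such as P407P and L814F can be sorted into synonymous and amino acid-changing classes mechanically. The following is a minimal sketch of that classification; the function name and the regular expression are illustrative assumptions and are not part of the original analysis pipeline.

```python
import re

# One-letter amino acid change such as "L814F"; "*" denotes a stop codon.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z*])$")


def classify_substitution(label: str) -> str:
    """Classify a label like 'P407P' as synonymous, missense, or nonsense."""
    match = SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    reference, _position, variant = match.groups()
    if variant == "*":
        return "nonsense"
    return "synonymous" if reference == variant else "missense"


# Labels taken from Table 1 and the accompanying text.
for label in ["P407P", "L814F", "G192E", "S22S", "E965*"]:
    print(label, classify_substitution(label))
```

Entries in which the reference and variant residues are identical (for example S22S) leave the protein sequence unchanged, which is why they are read as synonymous here.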

In 168-L, three transversions were identified in the R1 region, near the C-terminus, of P97
(MHP168_110/MHP168L_110), which encodes the cilium adhesin. In M. hyopneumoniae, attachment to the
respiratory epithelium is mainly mediated by the membrane protein P97 [4]. This protein is located
on the outer membrane surface, and its role in adherence has been firmly established. To bind
cilia, a minimum of eight tandem copies of the pentapeptide sequence (AAKPV/E) in R1 are required
[5]. Notably, all three transversion mutations were located in the tandem repeat unit (AAKPV/E),
causing an E863V substitution. Significant alteration in this critical repeat unit might partly
affect adhesion in 168-L.
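
The eight-copy requirement for the R1 repeat can be expressed as a count of uninterrupted AAKPV/E units. The sketch below shows one way to measure the longest tandem run in a protein sequence; the sequences are hypothetical fragments constructed for illustration, not real P97 sequence, and the function name is an assumption.

```python
import re

# R1 pentapeptide repeat unit: AAKPV or AAKPE, repeated in tandem.
R1_TANDEM_RUN = re.compile(r"(?:AAKP[VE])+")
MIN_COPIES_FOR_CILIUM_BINDING = 8  # threshold cited in the text [5]


def longest_r1_run(protein: str) -> int:
    """Return the number of units in the longest uninterrupted AAKP[V/E] run."""
    runs = (len(m.group(0)) // 5 for m in R1_TANDEM_RUN.finditer(protein))
    return max(runs, default=0)


# Hypothetical fragments (not real P97 sequence).
intact = "MKT" + "AAKPV" * 5 + "AAKPE" * 4 + "GSL"
interrupted = "MKT" + "AAKPV" * 5 + "AAKPL" + "AAKPE" * 3 + "GSL"

for name, seq in [("intact", intact), ("interrupted", interrupted)]:
    copies = longest_r1_run(seq)
    print(name, copies, copies >= MIN_COPIES_FOR_CILIUM_BINDING)
```

In this toy example a single non-matching unit splits a nine-copy run into runs of five and three, which illustrates why changes inside the repeat are of interest even when the total number of pentapeptides is unchanged.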

Previous studies have demonstrated that P102 binds fibronectin and contributes to the recruitment
of plasminogen to the M. hyopneumoniae cell surface [15]. P102 is commonly linked to the P97 cilium
adhesin, forming a two-gene operon [32]. Both P97 and P102 have several paralogs within the
M. hyopneumoniae genome; however, the paralogs contain only part of the complete sequence.
Interestingly, P102, the companion gene in this operon, was truncated at 564 bp by a single base
insertion in strain 168. Another intact copy of P102 (99% identity) was found 85 kb from this
operon. Conversely, in strain 168-L, the originally truncated P102 had reverted to the intact form,
while the other copy of P102, 85 kb away, was truncated.

The P146 adhesin-like protein of M. hyopneumoniae shows strong similarity to the LppS lipoprotein
of Mycoplasma conjunctivae, which is involved in in vitro adhesion [16]. In addition, the
N-terminal region of P146 shows strong similarity to the P97 adhesin and has a strongly hydrophobic
region (amino acids 7-29), indicating a transmembrane region and suggesting that the protein is
expressed on the surface of M. hyopneumoniae cells [33]. Compared with its counterpart MHP168_676
in 168, MHP168L_676 from 168-L has an in-frame insertion of one amino acid (Q) at the N-terminus of
P146. The enormous intra-specific diversity shown for the P146-encoding gene is at least partly due
to differences between several repeat regions present in the gene, most notably a polyserine chain
of variable length and a [Q]n[(P/S)Q]m repeat region [34]. Interestingly, this in-frame insertion
(Q) was located in the [Q]n[(P/S)Q]m repeat region. Polyserine chains often function as spacer
regions in proteins involved in complex carbohydrate degradation [35], while sequences rich in both
proline and glutamine are not uncommon and can form a conformation known as a polyproline II helix
[36,37]. Such proline-rich sequences are often involved in binding processes and are highly
immunogenic [37]. However, because the function of the P146 protein remains unknown, correlations
with virulence or adhesion are speculative and need further investigation.
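
The [Q]n[(P/S)Q]m notation describes a run of n glutamines followed by m PQ or SQ dipeptides, which maps directly onto a regular expression. The following is a minimal sketch of how a single in-frame Q insertion changes n; the two fragments are hypothetical and do not reproduce the actual P146 sequence, and the function name is an assumption.

```python
import re

# [Q]n[(P/S)Q]m: a run of Q followed by repeated PQ or SQ dipeptides.
PQ_REPEAT_REGION = re.compile(r"(Q+)((?:[PS]Q)+)")


def describe_repeat_region(protein: str):
    """Return (n, m) for the first [Q]n[(P/S)Q]m match, or None if absent."""
    match = PQ_REPEAT_REGION.search(protein)
    if match is None:
        return None
    q_run, dipeptides = match.groups()
    return len(q_run), len(dipeptides) // 2


# Hypothetical fragments differing by a single in-frame Q insertion.
parent = "MKL" + "Q" * 4 + "PQSQPQ" + "TGA"
variant = "MKL" + "Q" * 5 + "PQSQPQ" + "TGA"

print(describe_repeat_region(parent))
print(describe_repeat_region(variant))
```

Comparing the (n, m) pairs between two strains gives a compact way to report length variation in this region without aligning the full gene.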

P159 is a proteolytically processed surface adhesin of M. hyopneumoniae [17]. Three proteins with
apparent molecular masses of 27 (P27), 52 (P52), and 110 (P110) kDa were identified through
proteomic analysis of M. hyopneumoniae lysates [17], each representing a different region of P159.
These cleavage fragments are located on the cell surface and are present at all growth stages. In
168-L, MHP168L_504 (P159) has a missense mutation resulting in a G240A replacement in the
(S)(S)G(G)S repeat region of P159. Although this (S)(S)G(G)S repeat region has been reported, its
biological function is unknown.

P216 (MHP168L_503/MHP168_503) is a proteolytically processed cilium- and heparin-binding protein of
M. hyopneumoniae [18]. This surface protein is post-translationally processed to generate
N-terminal P120 and C-terminal P85 fragments, both of which can bind cilia [18]. The 168-L P216
gene has an in-frame four amino acid deletion in a poly Q motif near the C-terminus. Previous
studies have suggested that poly Q and KEKE motifs may play a role in maintaining P85 on the cell
surface [18,34]. The deletion affecting P216 may therefore alter its cilium adhesion and may be
associated with virulence attenuation in 168-L.

The MHP168L_424 gene and its product LppT were analyzed in detail because they showed approximately
22% identity to the LppT protein from M. conjunctivae. LppT is the second gene in a two-gene operon
with LppS, which was reported to be an adhesin in M. conjunctivae [16]. The LppT gene lacks a
promoter and is likely to be co-transcribed with LppS, suggesting a functional relationship between
LppS and LppT [16]. In M. hyopneumoniae, LppT encodes a protein of 954 aa with a calculated
molecular mass of 108 kDa. The gene product encoded by LppT is also a membrane protein, with a
signal sequence of 34 aa at the amino-terminal end and a transmembrane structure. Notably, one of
the amino acid substitutions in 168-L occurs near the C-terminus of LppT, resulting in an L814F
replacement.

Mutations altering the cell envelope and genes encoding 
secreted proteins 

Cell envelope proteins and secreted proteins are involved 
in virulence, host cell interaction, and immune responses 
[14,38,39]. Outer membrane protein P95 is a cell envelope
protein in M. hyopneumoniae. In 168-L, MHP168L_103
(P95) is truncated by a nonsense mutation compared to
168, resulting in an E965* termination near the
C-terminus of P95. This truncation of the coding region
could conceivably alter the function of the P95 outer
membrane protein.

M. hyopneumoniae contains an abbreviated membrane
protein secretory system [1]. The pathway consists of secA
(MHP168L_088), secY (MHP168L_128), secD (MHP168L_259),
prsA (MHP168L_664), dnaK (MHP168L_069), trigger
factor (MHP168L_154), and lepA (MHP168L_076). It
has recently been demonstrated that some pathogenic
bacteria use a type IV secretion system, composed of
subunits related to the conjugation machinery, to deliver
effector molecules to host cells [40], and that this
system may be involved in pathogenesis [41]. The only
mutation we found in the protein secretory system was a
synonymous substitution (P407P) in MHP168L_069, which
encodes the chaperone protein DnaK.

Mutations affecting antigens 

Studies on the antigenic properties of M. hyopneumoniae
revealed several immunodominant proteins, including the
P36 cytosolic protein [12,42], the P46, P65, and P74
membranous proteins [43-46], the elongation factor Tu
[47], the chaperone protein DnaK [47], the pyruvate
dehydrogenase E1-beta subunit [47], and the P97 adhesin
[4]. The functions of these proteins have not been well
elucidated, but specific reactants may eventually be useful
tools to diagnose M. hyopneumoniae [48].

The cytosolic P36 protein is a lactate dehydrogenase
[49] that induces an early immune response in pigs
that are experimentally or naturally infected by M.
hyopneumoniae [50]. Comparative studies with other
mycoplasmas commonly found in pigs demonstrated
that the P36 proteins carry highly conserved
species-specific antigenic determinants for M. hyopneumoniae
[42]. Hyperimmune sera produced against recombinant
P36 protein showed no reactivity against other porcine
mycoplasmas, including M. flocculare, M. hyorhinis,
and Acholeplasma laidlawii [12]. Notably, one of the
observed amino acid substitutions in 168-L occurs near
the C-terminus of P36 (MHP168L_167), resulting in an
N204D replacement.

P65 is an immunodominant surface lipoprotein of M.
hyopneumoniae that is specifically recognized during
disease [14]. Analysis of the translated amino acid sequence
of the gene encoding P65 revealed similarity to the
GDSL family of lipolytic enzymes [14]. Monospecific
antibodies against the heat shock protein-like P42 antigen,
part of P65, can block the growth of M. hyopneumoniae
[51]. In 168-L, MHP168L_668 (P65) has a missense
mutation resulting in a T138A replacement.

Mutations affecting transport proteins 

As mycoplasmas are dependent on the exogenous supply
of many nutrients, it has been predicted that they
may need many transport systems [3]. Motif analysis
revealed a family of proteins with a phosphotransferase
system (PTS) motif. The open reading frames (ORFs) included
sgaA (MHP168L_422), sgaB (MHP168L_421), sgaT
(MHP168L_563), mtlF (MHP168L_739), mtlA (MHP168L_561),
nagE (MHP168L_582), and licA (MHP168L_041) [7].
However, no mutations were identified in this PTS
transporter family. There are approximately 30 genes
with ABC transporter family signatures in the genome
of M. hyopneumoniae 168-L (Additional file 7: Table S7),
and five missense mutations and one synonymous
substitution were identified in this group. These affected a
cobalt import ATP-binding protein (MHP168L_284, K64E),
an ABC transporter permease protein (MHP168L_394,
A492G), a xylose ABC transporter ATP-binding protein
(MHP168L_523, S8S), and three ABC transporter
ATP-binding proteins (MHP168L_413, D118Y; MHP168L_462,
V219L; MHP168L_631, R60W). Interestingly, the expression
of MHP168L_394 and MHP168L_413 was reported
to be up-regulated in vivo during disease relative to
in vitro growth [11]. The variability between strains 168
and 168-L in multiple transport proteins suggests that
these mutations may affect growth and survival in
different hosts or host tissues.

Mutations affecting genes directly related to metabolism
and in vivo growth 

M. hyopneumoniae strain 168 encodes 695 genes,
approximately one quarter of which are involved in metabolism
and in vivo growth. Of particular interest were the
mutations observed in genes involved in various metabolic
pathways (Figure 3), including glycolysis/gluconeogenesis
(MHP168_167, MHP168_186), purine metabolism
(MHP168_289, MHP168_639), pyrimidine metabolism
(MHP168_086), glycerophospholipid metabolism (MHP168_596),
oxidative phosphorylation (MHP168_085), aminoacyl-tRNA
biosynthesis (MHP168_058), and the pentose phosphate
pathway (MHP168_142, MHP168_152). An in-frame
insertion of two amino acids (TG), a missense mutation
(E393G), and a nonsense mutation (N456*) were identified
in 168-L near the C-terminus of MHP168L_085, which
encodes an NADH oxidase involved in oxidative
phosphorylation. Mycoplasma genomes are deficient in genes coding
for components of intermediary and energy metabolism [3].
Thus, mycoplasmas depend mostly on glycolysis to
synthesize ATP [3]. In the glycolysis pathway, missense
mutations in both L-lactate dehydrogenase and pyruvate
dehydrogenase were observed, resulting in N204D and S194G
replacements, respectively. Iron deprivation is a prominent
feature of the host innate immune response, and most
certainly impacts growth of mycoplasmas in vivo [52]. Through
transcriptome analysis, MHP168_639 was identified to be
down-regulated during iron-limiting conditions [52]. This
suggests that MHP168_639 may play a role in the response
of M. hyopneumoniae to iron stress. In 168-L, MHP168L_639
has a missense mutation resulting in an E411K replacement.
Mutations in these metabolism-related genes that
accumulated over 300 in vitro passages likely affect growth
and survival within host cells.

In addition, approximately 41% of mutations affected 
genes coding for hypothetical proteins. Despite the lack 
of functional annotations for these genes, their disrup- 
tion in 168-L makes them obvious targets for investiga- 
tion as potential virulence factors. Further molecular 
genetics and in vivo studies are required to confirm and 
assess the relative importance of these genes in the at- 
tenuation of virulence in 168-L. 

Conclusions 

We successfully used a combination of genome sequencing
and comparative genomics strategies to provide a
comprehensive analysis of virulent and attenuated M.
hyopneumoniae strains to identify determinants involved
in pathogenesis. The genome of the attenuated
high-passage derivative strain 168-L was sequenced and
compared to virulent strain 168, revealing mutations in
numerous CDSs. These mutation-affected CDSs are
likely to be associated with virulence. We then compared
these putative virulence factor CDSs to reported
virulence determinants. Notably, almost all of the reported
M. hyopneumoniae virulence determinants were included
in the list of putative virulence factor CDSs. Variations in
the previously described mycoplasma adhesins (P97, P102, 
P146, P159, P216 and LppT), cell envelope proteins (P95), 
cell surface antigens (P36), secreted proteins, chaperone 
protein (DnaK), and genes directly related to metabolism 
and in vivo growth may contribute to loss of virulence in 
168-L. We then proceeded to characterize the alterations
in gene functions caused by mutations at the protein level,
and compared those mutations with previously described
repeat motifs that may be of primary importance for
virulence [34]. Interestingly, we found that many mutations
were located in the virulence-associated motifs of the
various proteins. To bind cilia, a minimum of eight tandem
copies of the pentapeptide sequence (AAKPV/E) in the R1
region of P97 are required [5]. We identified three
mutations in the tandem repeat unit (AAKPV/E), causing an
E863V substitution. A similar situation was also observed
in several other virulence-associated genes (P146, P159,
P216, and LppT). We hypothesize that the cumulative
effect of mutations in virulence-associated genes may
account for the attenuation of virulence in 168-L. In this
study, a total of 330 genetic variations were identified.
While these included reported virulence-related genes,
other novel virulence determinants were also identified.
However, further molecular genetics and in vivo studies
are required to confirm and assess the relative importance
of these suspected novel virulence determinants in the
attenuation of virulence. The comparative genomic analysis
presented here not only provides insights into the basis
of attenuation of virulence in 168-L, but may also provide
targets for mutagenesis in the development of a
more efficacious vaccine.

Methods 

Bacterial strains, growth conditions, and DNA extraction 

Clonal isolates of M. hyopneumoniae strain 168 and
168-L were selected for sequencing. Both strains
were grown in KM2 cell-free medium at 37°C. The culture
was harvested from 100 mL KM2 cell-free medium
by centrifugation at 1,200 × g for 30 min, and then total
genomic DNA was extracted from mycoplasma cultures
using a TIANamp Bacteria DNA Kit (Tiangen, Beijing,
China) according to the manufacturer's instructions.

Genome sequencing and assembly 

Genomic libraries containing 8 kb inserts were constructed
according to the manufacturer's protocols.
Whole-genome sequencing of strain 168-L was performed
by combining GS FLX and Solexa paired-end
sequencing technologies. A total of 242,507 reads (67.4%
paired ends) were produced with the GS FLX system,
giving 44.5-fold coverage of the genome. Eighty-eight
percent (215,346) of reads were assembled into one large
scaffold using Newbler (454 Life Sciences, Branford, CT,
USA). A total of 1,971,358 reads were generated with an
Illumina Solexa genome analyzer IIx (Illumina, San
Diego, CA, USA) and were mapped to the scaffold with
the Burrows-Wheeler Aligner (BWA) tool [53]. Gaps
were filled by local assembly of the Solexa/Roche 454
reads or sequencing PCR products using a Prism 3730
capillary sequencer (Applied Biosystems, Foster City,
CA, USA). All repeated DNA regions and low-quality
regions were verified by PCR and sequencing of the
product amplified from genomic DNA.
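
The read statistics above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it recomputes the assembled fraction and the approximate number of paired-end reads from the reported counts, and uses the usual definition of fold coverage (reads × mean read length / genome size) to show what mean read length a given genome size would imply. The function name and the idea of passing in a genome size are assumptions made for illustration; the genome size itself is not stated in this section.

```python
# Values reported in the text for the GS FLX run.
total_reads = 242_507
assembled_reads = 215_346
paired_fraction = 0.674
fold_coverage = 44.5

# Fraction of reads placed in the large scaffold.
assembled_fraction = assembled_reads / total_reads

# Approximate number of paired-end reads.
paired_reads = round(total_reads * paired_fraction)


def implied_mean_read_length(genome_size_bp):
    """Mean read length implied by coverage = reads * mean_length / genome_size.

    genome_size_bp must be supplied by the caller (hypothetical input here).
    """
    return fold_coverage * genome_size_bp / total_reads


print(f"assembled fraction: {assembled_fraction:.3f}")
print(f"paired-end reads:   {paired_reads}")
```

The assembled fraction comes out at about 0.888, consistent with the "eighty-eight percent" quoted above.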

Annotation and sequence analyses 

Open reading frames containing more than 30 amino acid
residues were predicted using Glimmer 3.0 [54] with
modified genetic code 4 and verified manually using the
strain 168 annotation. Discrepant loci between the 168
and 168-L consensus sequences were manually examined
for support at the trace data level. Transfer RNA (tRNA)
and ribosomal RNA (rRNA) genes were predicted using
the tRNAscan-SE program [55] or by observing similarities
with the M. hyopneumoniae strain 232 and strain J
rRNA genes. Artemis (release 12) [56] was used to collate
and annotate data. Functional predictions were based on
BLASTP similarity searches against the UniProtKB [57],
GenBank [58], Swiss-Prot protein [59], and COG [60]
databases. EC numbers were assigned using the Kyoto
Encyclopedia of Genes and Genomes (KEGG) [67] and
metabolic pathways were mapped and analyzed using the
KEGG Pathway Database
(http://www.genome.jp/kegg/pathway.html). Pseudogenes were
detected by BLASTN analysis, comparing the genome
sequences of 168-L with those of 232 and J, and then the
annotation was revised manually.
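
Genetic code 4 matters for ORF prediction because, in mycoplasmas, UGA encodes tryptophan rather than a stop. The sketch below illustrates that effect only; it is not Glimmer and does not reproduce its gene model. It is a simplified forward-strand scan that assumes ATG as the sole start codon, and the function names (`codon_table`, `find_orfs`) are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (first, second, third base).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(tga_is_trp=True):
    """Build a codon table; with tga_is_trp=True it matches genetic code 4."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AA))
    if tga_is_trp:
        table["TGA"] = "W"  # UGA read as tryptophan instead of stop
    return table


def translate(dna, table):
    """Translate complete codons of dna; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def find_orfs(dna, table, min_aa=30):
    """Return (start, end, protein) for forward-strand ATG..stop ORFs
    whose protein is longer than min_aa residues (simplified sketch)."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif table.get(codon, "X") == "*":
                protein = translate(dna[start:i], table)
                if len(protein) > min_aa:
                    orfs.append((start, i + 3, protein))
                start = None
    return orfs


# Toy sequence: ATG, then 31 TGA codons, then a TAA stop.
toy = "ATG" + "TGA" * 31 + "TAA"
orfs_code4 = find_orfs(toy, codon_table(tga_is_trp=True))
orfs_standard = find_orfs(toy, codon_table(tga_is_trp=False))
```

With genetic code 4 the toy sequence yields a single 32-residue ORF, whereas under the standard code the first TGA terminates translation and nothing passes the length filter.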

Single nucleotide polymorphism (SNP) analysis 

Nucleotide comparisons and single nucleotide poly- 
morphism (SNP) analysis for strains 168 and 168-L were 
performed using the Artemis Comparison Tool (ACT) 
[61] and Mauve 2.3.1 genome alignment software [62]. 
ORF graphical visualization and manual annotation were 
carried out using Artemis, release 12 [56]. Screening for 
unusual coding differences between the 168 and 168-L 
genomes (stops and frame shifts) was conducted using 
FASTA program packages [63,64] and BLAST [65]. The 
coding differences between the 168 and 168-L genomes 
were checked manually. 
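
The substitution labels used throughout the Results (for example N204D, P407P, and E965*) follow from a codon-by-codon comparison of the two coding sequences. A minimal sketch of that classification is given below, assuming two equal-length, in-frame CDSs with no indels and translation under genetic code 4. It is an illustration only, not the ACT/Mauve/FASTA workflow described above, and `classify_substitutions` is a hypothetical name.

```python
import itertools

BASES = "TCAG"
AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Genetic code 4: the standard table with TGA reassigned to tryptophan.
TABLE4 = {a + b + c: aa
          for (a, b, c), aa in zip(itertools.product(BASES, repeat=3), AA)}
TABLE4["TGA"] = "W"


def classify_substitutions(ref_cds, alt_cds):
    """Compare two in-frame CDSs of equal length codon by codon.

    Returns a list of (label, kind) tuples, where label looks like 'N204D'
    and kind is 'synonymous', 'missense', or 'nonsense'.
    """
    if len(ref_cds) != len(alt_cds):
        raise ValueError("sequences must have equal length (no indels)")
    calls = []
    for i in range(0, len(ref_cds) - 2, 3):
        ref_codon = ref_cds[i:i + 3].upper()
        alt_codon = alt_cds[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        ref_aa, alt_aa = TABLE4[ref_codon], TABLE4[alt_codon]
        position = i // 3 + 1  # 1-based residue number
        if ref_aa == alt_aa:
            kind = "synonymous"
        elif alt_aa == "*":
            kind = "nonsense"
        else:
            kind = "missense"
        calls.append((f"{ref_aa}{position}{alt_aa}", kind))
    return calls


# Toy example (not real M. hyopneumoniae sequence): M N P E vs M D P *
calls = classify_substitutions("ATGAATCCTGAA", "ATGGATCCATAA")
```

Frameshifts and in-frame indels, which the text also reports, would need an alignment step first and are outside the scope of this sketch.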

Accession numbers 

M. hyopneumoniae strains 168 and 168-L genome seq- 
uences have been deposited in GenBank under accession 
numbers CP002274.1 and CP003131, respectively. 